Optimizing Data - Parallel Stencil

نویسندگان

Stephen W. Chappelow

Philip J. Hatcher

James R. Mason

چکیده

We have developed a communication optimizer that concentrates on stencil communication patterns. This optimizer has been done in the context of the UNH C* compiler that targets distributed-memory MIMD computers. Our work has two distinguishing features: The compiler/optimizer is designed to be highly portable. We achieve this goal by providing eecient support for the optimizations in the run-time library. As well as performing aggregation for messages that share the same source and destination, we employ a specialized store-and-forward protocol that reduces the total number of messages initiated.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

University of Delaware Department of Electrical and Computer Engineering Computer Architecture and Parallel Systems Laboratory Diamond Tiling: A Tiling Framework for Time-iterated Scientific Applications

This paper fully develops Diamond Tiling, a technique to partition the computations of stencil applications such as FDTD. The Diamond Tiling technique is the result of optimizing the amount of useful computations that can be executed when a region of memory is loaded to the local memory of a multiprocessor chip. Diamond Tiling contributes to the state of the art on time tiling techniques in tha...

متن کامل

Optimizing Transformations of Stencil Operations for Parallel Cache-based Architectures

This paper describes a new technique for optimizing serial and parallel stencil-and stencil-like operations for cache-based architectures. This technique takes advantage of the semantic knowledge implicitly in stencil-like computations. The technique is implemented as a source-to-source program transformation; because of its speci-city it could not be expected of a conventional compiler. Empiri...

متن کامل

Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines Citation

Image processing pipelines combine the challenges of stencil computations and stream programs. They are composed of large graphs of different stencil stages, as well as complex reductions, and stages with global or data-dependent access patterns. Because of their complex structure, the performance difference between a naive implementation of a pipeline and an optimized one is often an order of ...

متن کامل

Domain-Specific Optimization of Two Jacobi Smoother Kernels and Their Evaluation in the ECM Performance Model

Our aim is to apply program transformations to stencil codes in order to yield the highest possible performance. We recognize memory bandwidth as a major limitation in stencil code performance. We conducted a study in which we applied optimizing transformations to two Jacobi smoother kernels: one 3D 1st-order 7-point stencil and one 3D 3rd-order 19-point stencil. To obtain high performance, the...

متن کامل

Multicore-optimized wavefront diamond blocking for optimizing stencil updates

The importance of stencil-based algorithms in computational science has focused attention on optimized parallel implementations for multilevel cache-based processors. Temporal blocking schemes leverage the large bandwidth and low latency of caches to accelerate stencil updates and approach theoretical peak performance. A key ingredient is the reduction of data traffic across slow data paths, es...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 1995

Optimizing Data - Parallel Stencil

نویسندگان

چکیده

منابع مشابه

University of Delaware Department of Electrical and Computer Engineering Computer Architecture and Parallel Systems Laboratory Diamond Tiling: A Tiling Framework for Time-iterated Scientific Applications

Optimizing Transformations of Stencil Operations for Parallel Cache-based Architectures

Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines Citation

Domain-Specific Optimization of Two Jacobi Smoother Kernels and Their Evaluation in the ECM Performance Model

Multicore-optimized wavefront diamond blocking for optimizing stencil updates

عنوان ژورنال:

اشتراک گذاری